AITopics | maximum probability

2508.00079

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
Asia > Singapore (0.04)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)
(8 more...)

Genre: Research Report > Experimental Study (0.46)

Industry: Education > Educational Setting (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Joung, Youngju, Lee, Sehyun, Choi, Jaesik

Probing Network Decisions: Capturing Uncertainties and Unveiling Vulnerabilities Without Label Information

arXiv.org Artificial Intelligence, Mar-12-2025

To improve trust and transparency, it is crucial to be able to interpret the decisions of deep neural classifiers (DNNs). Instance-level examinations, such as attribution techniques, are commonly employed to interpret model decisions. However, interpreting misclassified decisions may require human intervention. Analyzing the attributions across each class within one instance can be particularly labor-intensive and influenced by the bias of the human interpreter. In this paper, we present a novel framework to uncover the weaknesses of the classifier via counterfactual examples. A prober is introduced to learn the correctness of the classifier's decision as a binary code: hit or miss. This enables the creation of counterfactual examples with respect to the prober's decision. We test the performance of our prober's misclassification detection and verify its effectiveness on image classification benchmark datasets. Furthermore, by generating counterfactuals that penetrate the prober, we demonstrate on the MNIST dataset that our framework effectively identifies vulnerabilities in the target classifier without relying on label information.

classifier, dataset, prober, (16 more...)

doi: 10.1007/978-981-97-8702-9_21

2503.09068

Country: Asia > South Korea > Daejeon > Daejeon (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Chi, Ta-Chung, Fan, Ting-Han, Rudnicky, Alexander I.

Attention Alignment and Flexible Positional Embeddings Improve Transformer Length Extrapolation

arXiv.org Artificial Intelligence, Nov-15-2023

An ideal length-extrapolatable Transformer language model can handle sequences longer than the training length without any fine-tuning. Such long-context utilization capability relies heavily on a flexible positional embedding design. Upon investigating the flexibility of existing large pre-trained Transformer language models, we find that the T5 family deserves a closer look, as its positional embeddings capture rich and flexible attention patterns. However, T5 suffers from the dispersed attention issue: the longer the input sequence, the flatter the attention distribution. To alleviate the issue, we propose two attention alignment strategies via temperature scaling. Our findings show improvement on the long-context utilization capability of T5 on language modeling, retrieval, multi-document question answering, and code completion tasks without any fine-tuning. This suggests that a flexible positional embedding design and attention alignment can go a long way toward Transformer length extrapolation.
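The dispersed attention issue and the effect of temperature scaling can be illustrated with a toy softmax. This is a minimal sketch, not the authors' alignment strategies: the logits, the sequence lengths, and the temperature value 0.25 are all hypothetical, chosen only to show that a lower temperature sharpens a distribution that has flattened with length.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def make_logits(length):
    # Hypothetical attention logits: one relevant position, the rest distractors.
    return [2.0] + [0.0] * (length - 1)


short_attn = softmax(make_logits(8))
long_attn = softmax(make_logits(512))
sharpened = softmax(make_logits(512), temperature=0.25)

print("max prob, length 8:          ", max(short_attn))
print("max prob, length 512:        ", max(long_attn))
print("max prob, length 512, T=0.25:", max(sharpened))
print("entropy, length 512:         ", entropy(long_attn))
print("entropy, length 512, T=0.25: ", entropy(sharpened))
```

With the same logit gap, the maximum attention probability drops as the sequence grows, and dividing the logits by a temperature below 1 raises it again while lowering the entropy.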

alignment strategy, language model, sequence, (12 more...)

2311.00684

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > Canada > Ontario > Toronto (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.46)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.35)

Neural Information Processing Systems, Apr-6-2023, 13:53:48 GMT

An LP View of the M-best MAP problem

We consider the problem of finding the M assignments with maximum probability in a probabilistic graphical model. We show how this problem can be formulated as a linear program (LP) on a particular polytope. We prove that, for tree graphs (and junction trees in general), this polytope has a particularly simple form and differs from the marginal polytope in a single inequality constraint. We use this characterization to provide an approximation scheme for non-tree graphs, by using the set of spanning trees over such graphs. The method we present puts the M-best inference problem in the context of LP relaxations, which have recently received considerable attention and have proven useful in solving difficult inference problems.
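To make the M-best MAP problem statement concrete, the sketch below enumerates every assignment of a tiny chain model and keeps the M highest-scoring ones. It is a brute-force illustration only, not the LP formulation from the paper, and the potentials in `UNARY` and `pairwise` are hypothetical values.

```python
import heapq
from itertools import product

# Hypothetical unary potentials for three binary variables on a chain.
UNARY = [[0.6, 0.4], [0.4, 0.6], [0.3, 0.7]]


def pairwise(a, b):
    """Hypothetical pairwise potential that favors equal neighbors."""
    return 2.0 if a == b else 1.0


def score(assignment):
    """Unnormalized probability: product of unary and pairwise potentials."""
    value = 1.0
    for i, state in enumerate(assignment):
        value *= UNARY[i][state]
    for left, right in zip(assignment, assignment[1:]):
        value *= pairwise(left, right)
    return value


def m_best(m):
    """Return the m highest-scoring (score, assignment) pairs, best first."""
    candidates = product([0, 1], repeat=len(UNARY))
    return heapq.nlargest(m, ((score(x), x) for x in candidates))


for value, assignment in m_best(3):
    print(assignment, round(value, 3))
```

Enumeration costs time exponential in the number of variables, which is why structured approaches such as the LP view described in the abstract are of interest.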

assignment, map problem, polytope, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.88)

Jecmen, Steven, Shah, Nihar B., Fang, Fei, Conitzer, Vincent

Tradeoffs in Preventing Manipulation in Paper Bidding for Reviewer Assignment

arXiv.org Artificial Intelligence, Jul-22-2022

Many conferences rely on paper bidding as a key component of their reviewer assignment procedure. These bids are then taken into account when assigning reviewers to help ensure that each reviewer is assigned to suitable papers. However, despite the benefits of using bids, reliance on paper bidding can allow malicious reviewers to manipulate the paper assignment for unethical purposes (e.g., getting assigned to a friend's paper). Several different approaches to preventing this manipulation have been proposed and deployed. In this paper, we enumerate certain desirable properties that algorithms for addressing bid manipulation should satisfy. We then offer a high-level analysis of various approaches along with directions for future investigation.

artificial intelligence, machine learning, reviewer, (17 more...)

2207.11315

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.95)
Information Technology > Artificial Intelligence > Machine Learning (0.93)

Pawar, Ashish Anil, Warbhe, Ujwal

Optimizing Bayesian acquisition functions in Gaussian Processes

arXiv.org Machine Learning, Nov-8-2021

Bayesian optimization is a popular technique for optimizing a black-box function, especially in high dimensions. For a known objective function, various optimization methods are readily available to choose from. For a black-box function, the true nature of the objective is unknown, so many available optimization techniques, including gradient descent, cannot be applied. Other techniques such as grid search and random search remain available, but both are extremely inefficient and time-consuming, especially if the objective function is costly to evaluate. Instead, Bayesian optimization tries to find the global optimum by using a surrogate function to approximate the real objective function, making the computation much more efficient with respect to time or money.
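As a rough illustration of how a surrogate guides the search, the sketch below computes two standard acquisition functions for a maximization problem: probability of improvement and expected improvement. It assumes the surrogate already supplies a Gaussian posterior mean `mu` and standard deviation `sigma` at a candidate point; the function names and the `xi` exploration margin are illustrative choices, not taken from the paper.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def probability_of_improvement(mu, sigma, best, xi=0.0):
    """P(f(x) > best + xi) under a Gaussian posterior N(mu, sigma^2)."""
    if sigma <= 0.0:
        return 1.0 if mu > best + xi else 0.0
    return normal_cdf((mu - best - xi) / sigma)


def expected_improvement(mu, sigma, best, xi=0.0):
    """E[max(f(x) - best - xi, 0)] under a Gaussian posterior N(mu, sigma^2)."""
    gap = mu - best - xi
    if sigma <= 0.0:
        return max(gap, 0.0)
    z = gap / sigma
    return gap * normal_cdf(z) + sigma * normal_pdf(z)


# Hypothetical posterior values at two candidate points, with best observed value 1.0.
print(expected_improvement(mu=1.2, sigma=0.1, best=1.0))
print(expected_improvement(mu=0.9, sigma=0.8, best=1.0))
```

A Bayesian optimization loop would evaluate the costly objective at the candidate that maximizes the chosen acquisition function, then refit the surrogate with the new observation.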

acquisition function, maximum probability, optimization, (12 more...)

2111.0493

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.70)

#artificialintelligence, Jun-29-2020, 03:20:04 GMT

Pytorch: Real Step by Step implementation of CNN on MNIST

Here is a quick tutorial on how to implement a CNN in PyTorch and the advantages of doing so. We go over the code line by line so that you can avoid bugs when implementing it! In this article, we take on the task of implementing a Convolutional Neural Network in PyTorch. I wanted to write on this topic because of the overwhelming number of unexplained and bug-ridden implementations that swarm the internet and prevent most people from starting quickly on their own implementations. Note, however, that I assume the reader has some basic knowledge of neural networks and CNNs; if not, see the links at the bottom of the article for a better understanding before starting.

artificial intelligence, implementation, machine learning, (14 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Demeter, David, Kimmel, Gregory, Downey, Doug

Stolen Probability: A Structural Weakness of Neural Language Models

arXiv.org Machine Learning, May-5-2020

Neural Network Language Models (NNLMs) generate probability distributions by applying a softmax function to a distance metric formed by taking the dot product of a prediction vector with all word vectors in a high-dimensional embedding space. The dot-product distance metric forms part of the inductive bias of NNLMs. Although NNLMs optimize well with this inductive bias, we show that this results in a sub-optimal ordering of the embedding space that structurally impoverishes some words at the expense of others when assigning probability. We present numerical, theoretical and empirical analyses showing that words on the interior of the convex hull in the embedding space have their probability bounded by the probabilities of the words on the hull.
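The bound described above can be checked on a toy example. The sketch below is an illustration under stated assumptions, not the paper's analysis: it places four hypothetical word vectors on the corners of a diamond in a 2-D embedding space, puts a fifth word strictly inside their convex hull, and sweeps prediction vectors around the circle to see whether the interior word ever receives the highest softmax probability.

```python
import math

# Hypothetical 2-D embeddings: four words on the hull, one strictly inside it.
HULL_WORDS = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
INTERIOR_WORD = (0.2, 0.1)
EMBEDDINGS = HULL_WORDS + [INTERIOR_WORD]


def softmax(logits):
    """Numerically stable softmax."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def word_probabilities(prediction, embeddings):
    """Softmax over dot products of the prediction vector with each word vector."""
    logits = [sum(p * e for p, e in zip(prediction, emb)) for emb in embeddings]
    return softmax(logits)


def interior_ever_wins(num_directions=360, norm=5.0):
    """Check whether the interior word is ever the most probable word."""
    for k in range(num_directions):
        angle = 2.0 * math.pi * k / num_directions
        prediction = (norm * math.cos(angle), norm * math.sin(angle))
        probs = word_probabilities(prediction, EMBEDDINGS)
        if probs[-1] >= max(probs[:-1]):
            return True
    return False


print("interior word ever most probable:", interior_ever_wins())
```

Because the dot product is linear in the word vector, its maximum over a convex hull is attained at a vertex, so no prediction vector in this sweep can rank the interior word above every hull word.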

artificial intelligence, machine learning, natural language, (17 more...)

2005.02433

Country: North America > United States (0.04)

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Macêdo, David, Ren, Tsang Ing, Zanchettin, Cleber, Oliveira, Adriano L. I., Tapp, Alain, Ludermir, Teresa

Distinction Maximization Loss: Fast, Scalable, Turnkey, and Native Neural Networks Out-of-Distribution Detection simply by Replacing the SoftMax Loss

arXiv.org Machine Learning, Aug-19-2019

Recently, many methods to reduce neural networks' uncertainty have been proposed. However, most of the techniques used in these solutions present severe drawbacks. In this paper, we argue that neural networks' low out-of-distribution detection performance is mainly due to the anisotropy of the SoftMax loss. Therefore, we built an isotropic loss to reduce neural networks' uncertainty in a fast, scalable, turnkey, and native approach. Our experiments show that replacing SoftMax with the proposed loss does not affect classification accuracy. Moreover, our proposal typically outperforms ODIN by a large margin while usually producing competitive results against a state-of-the-art Mahalanobis method, despite avoiding its limitations. Hence, neural networks' uncertainty may be significantly reduced by a simple loss change without relying on special procedures such as data augmentation, adversarial training/validation, ensembles, or additional classification/regression models.

artificial intelligence, machine learning, neural network, (13 more...)

1908.05569

Country: North America > Canada (0.28)

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Hahn, Ernst Moritz, Perez, Mateo, Schewe, Sven, Somenzi, Fabio, Trivedi, Ashutosh, Wojtczak, Dominik

Omega-Regular Objectives in Model-Free Reinforcement Learning

arXiv.org Machine Learning, Sep-26-2018

We provide the first solution for model-free reinforcement learning of ω-regular objectives for Markov decision processes (MDPs). We present a constructive reduction from the almost-sure satisfaction of ω-regular objectives to an almost-sure reachability problem and extend this technique to learning how to control an unknown model so that the chance of satisfying the objective is maximized. A key feature of our technique is the compilation of ω-regular properties into limit-deterministic Büchi automata instead of the traditional Rabin automata; this choice sidesteps difficulties that have marred previous proposals. Our approach allows us to apply model-free, off-the-shelf reinforcement learning algorithms to compute optimal strategies from the observations of the MDP. We present an experimental evaluation of our technique on benchmark learning problems.

artificial intelligence, machine learning, reinforcement learning, (19 more...)

1810.0095

Country: North America > United States (0.46)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.67)